(54) Ubiquitin conjugating enzyme (E2) fusion proteins.



(57) A novel class of fusion proteins based on the ubiquitin carrier protein, or E2, is described. The fusion
proteins include, in addition to the E2 activity, a protein binding ligand having specific affinity for a target
protein. It has been discovered that under cytosolic conditions, such E2 fusions will add a ubiquitin
moiety to a target protein. Since ubiquitin addition triggers the endogenous cellular protein degradation
pathway, such E2 fusion proteins can be used to selectively target proteins in a host for
degradation. Thus, E2 fusion protein genes can be introduced into transgenic organisms to defeat or
inhibit natural activities or traits. The E2 fusion proteins can also be used by introduction into hosts for
similar effects.
technology of antisense. In antisense technology, a gene is introduced into a living cell which is designed to
produce an antisense RNA transcript. The antisense RNA transcript is intended to hybridize under in vivo
conditions with an mRNA transcript natively present in the cells. The hybridization of those two RNA molecules
creates a double stranded complex which is then degraded by as yet uncharacterized cellular mechanisms. The
net result is that the level of expression of the gene creating the target RNA is dramatically reduced or, in some
cases, practically eliminated. It has also been proposed that free antisense RNA molecules might be delivered
into the blood stream of vertebrates in order to suppress the expression of unwanted proteins.

There are no known prior attempts to use the ubiquitin protein degradation pathway to artificially induce 
ubiquitination and degradation of targeted proteins. 


Summary of the Invention 

The present invention is summarized in that synthetic E2 molecules are created which include a body portion,
a tail portion which is either natural or synthetic, and a heterologous protein binding ligand attached to
the carboxyl end of the tail portion. It has been demonstrated that such molecules are capable of adding a
ubiquitin moiety to target proteins recognized by the protein binding ligand.

It is a further object of the present invention to describe gene sequences which are capable of causing 
the expression of such synthetic E2 protein recognition molecules in heterologous hosts. 

It is also a feature of the present invention that indigenous proteins in transgenic organisms can be
targeted for degradation through the use of such synthetic E2 protein recognition molecules.

Other objects, advantages, and features of the present invention will become apparent from the following 
specification when taken in conjunction with the accompanying drawings and sequence listings. 

Brief Description of the Drawings 


Fig. 1 is a schematic illustration of the ubiquitin-directed protein degradation cycle as it occurs in eukaryotic
cells. 

Fig. 2 is a schematic illustration of a prototype molecule constructed in accordance with the present
invention.

Fig. 3 is a schematic illustration of a synthetic molecule constructed of UBC4 and a c-myc ligand, and its
use to tag a target antibody molecule with ubiquitin.

Fig. 4 is a schematic illustration of a molecule composed of UBC4 and TGFα protein, and its use in directing
ubiquitin addition to a target protein. 

Fig. 5 is a schematic illustration of a molecule composed of UBC4 and gene V, and its use in directing
ubiquitin addition to a target protein.

Fig. 6 is an illustration of a molecule composed of UBC4 and protein A and its use in directing ubiquitin 
addition to a target protein. 

Fig. 7 is a schematic illustration of a proposed molecule consisting of UBC4 and an antibody, and its use 
in directing ubiquitin attachment to a protein. 


Description of the Preferred Embodiment 

It is proposed, in accordance with the present invention, that a novel class of E2 derived fusion proteins
be constructed. This novel class of E2 fusion proteins is illustrated, as a prototype, in Fig. 2. E2 fusion
proteins of this class are constructed so as to add a ubiquitin moiety with specificity to a target protein, and
are composed of three main constituents as illustrated in Fig. 2. One constituent, designated at 24, is a natural
or artificial E2 protein core region. The second element, indicated at 26, is a spacer which can be either an
artificially constructed spacer or a tail region from a native E2 isoform. The third constituent, designated at
28 in Fig. 2, is a protein binding ligand, again either natural or artificial, which has unique specificity to a
target protein of interest.

The function of the E2 fusion protein molecule described above is to add one or more ubiquitin residues
to a target protein of interest, such a protein being indicated at 30 in Fig. 2. Since the pathway of
ubiquitin-directed protein degradation in vivo in cells is actuated by the addition of ubiquitin to proteins to be
degraded, all indications are that the addition of such ubiquitin protein tags to specific proteins will cause their
degradation. In this manner, it becomes possible to target specific proteins for degradation through the use of
novel E2 proteins such as those described above.

As also described briefly above, each of the class of proteins called ubiquitin conjugating enzymes, or E2s, 
consists of one or both of two main constituent domains. All known E2 isoforms include a core region, which 

E2 core region must also be capable of transferring … Since ubiquitin
… Fig. 2 and below. It is not believed that the source of …
proteins are extraordinarily highly conserved among … the ubiquitin, the core regions from E2 …
affinity to the protein which is to be signaled for degradation. … E2 fusion protein
core region of a UBC1 type E2, and the … of E3. Accordingly, it is a requirement
E2 molecule in a functional E2 fusion protein …
ubiquitin is to be added.

Also shown in Fig. 2 as a part of the E2 fusion protein is a protein-binding ligand 28, intended to have
specific affinity for a targeted protein. The protein … the target protein of interest. There
form, forms a dimer with other similar proteins in vivo, to target ubiquitin addition to the associated proteins in
vivo. This demonstrates that native protein-to-protein associations may be used to identify protein-binding
ligands. It is possible to use as the protein binding ligand an epitope which will be specifically bound by a
particular antibody. In most situations, however, what will be ultimately desired for the protein binding ligand
would be an antibody, or at least the recognition portion of an antibody, such as a single chain antibody
fragment. For proteins which are not known to have another protein having specific attraction for them, the
ability to create and utilize antibody recognition domains specific for the target protein is essential. Through
the use of antibody recognition sequences for the protein binding ligand, it is possible to target virtually any
protein in a biological system for ubiquitin addition, and ubiquitin-directed protein degradation, in the manner
described herein using an E2 fusion protein. All that seems to be required is for the target protein to have at
least one lysine residue to which the ubiquitin may be attached.

It is possible to produce the E2 fusion protein of Fig. 2 by a variety of processes for use in ubiquitin-directed
protein degradation. For example, it is possible to clone and/or construct a DNA sequence which, in its entirety,
codes for an E2 fusion molecule of the class shown in Fig. 2. The DNA coding sequence could include, working
5' to 3', first the coding sequence for a selected E2 core region, such as those native sequences described in
the sequences below, or could include a consensus or artificial E2 core region sequence. The DNA coding
sequence would then include, 3' to the core region sequence, a sequence encoding a spacer, or a tail region
from E2. Again, this portion could be native or artificial, and both the core region and spacer could be from a
single native E2, as in the case of UBC4 below. Finally, at its 3' end, the coding sequence would include the
coding region for the protein binding ligand. When such a coding region for an E2 fusion protein is placed behind
a promoter effective in a host of choice, and 5' to a transcription terminator, an expression cassette for the
recombinant sequence expressing an E2 fusion protein, such as illustrated in Fig. 2, is created. That expression
cassette can then be transformed into a heterologous host and expressed. Various transformation techniques
for plants and animals are known in the art and need not be described further here. The E2 fusion protein thus
expressed will then direct ubiquitin addition to a target protein of interest in vivo. This ubiquitin tagging will
direct degradation of the target protein by the cell, using its normal processes. In this way, transgenic organisms
can be created which differ from their native or non-transformed ancestors by the active degradation of a
specific unwanted protein. In this way it is possible to "turn off" proteins which are not desired to be expressed.
This approach can be used in animal, plant, microbial or fungal systems which have an active
ubiquitin-directed proteolytic pathway, to turn off unwanted activities, structures or traits.
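
The 5' to 3' ordering of the cassette described in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration of the ordering and of the in-frame requirement: the names `Part`, `assemble_coding_region` and `assemble_cassette` are hypothetical, and the short sequences are placeholders rather than real genetic elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """One segment of the cassette; sequence is DNA written 5' to 3'."""
    name: str
    sequence: str


def assemble_coding_region(core: Part, spacer: Part, ligand: Part) -> str:
    """Join core, spacer and ligand 5' to 3', checking the reading frame."""
    for part in (core, spacer, ligand):
        if len(part.sequence) % 3 != 0:
            raise ValueError(f"{part.name} would shift the reading frame")
    return core.sequence + spacer.sequence + ligand.sequence


def assemble_cassette(promoter: Part, coding: str, terminator: Part) -> str:
    """Place the coding region behind the promoter and 5' to the terminator."""
    return promoter.sequence + coding + terminator.sequence


# Placeholder sequences for illustration only; none is a real genetic element.
promoter = Part("promoter", "TTGACA")
core = Part("E2 core", "ATGGCTTCT")
spacer = Part("spacer", "GGTGGT")
ligand = Part("ligand", "GAACAGTAA")
terminator = Part("terminator", "CCCGGG")

coding = assemble_coding_region(core, spacer, ligand)
cassette = assemble_cassette(promoter, coding, terminator)
print(cassette)
```

The frame check reflects the assumption that each segment is supplied as a whole number of codons, so that the ligand is read in the same frame as the E2 core.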

Alternatively, E2 fusion proteins such as those described and illustrated in Fig. 2 can be produced by
expression in one host for ultimate delivery into another. For example, it is possible to take a synthetic coding
sequence, as described in the preceding paragraph, and express that sequence in a prokaryotic host to
produce E2 fusion protein, assuming only that the promoter is properly chosen. For example, it is quite common
to produce therapeutic proteins by the fermentation of prokaryotic bacteria, and then to subsequently isolate
and purify the desired protein. In this fashion, useful quantities of E2 fusion proteins can be produced and
isolated. The protein can then be delivered to an organism for possible therapeutic or other treatment. If
production of the E2 fusion protein is performed outside of the host, it is envisioned that further treatment or
modification of the E2 fusion protein may be desired. For example, liposome encapsulation of E2 fusion proteins
may aid in their introduction into target cells. Alternatively, further protein domains could be added to the amino
terminus of the E2 fusion proteins to target cellular receptors to induce introduction of the proteins into targeted
cells in vivo.

A wide variety of possible applications for this technology are envisioned. In plant systems, it becomes
possible to target the degradation of a specific enzyme, to turn off an unwanted plant metabolic pathway. For
example, it now becomes possible to alter the secondary metabolite products of a plant by targeting for
degradation one or more enzymes in the cascade for the unwanted secondary metabolite. The E2 fusion protein
approach offers the promise of another mechanism for the control of virus infection in plants, by targeting the
degradation of either the viral coat protein or one or more viral enzymes, e.g. a transcriptase, necessary for
viral replication or activity in infected plant cells. It also becomes possible to target the degradation of specific
cellular receptors, to diminish sensitivity to one or more otherwise undesirable effects caused by exposure to
some environmental stimulus. It is possible to target the degradation of specific plant phytochromes or enzymes,
so as to alter plant vigor and growth.

In mammalian systems, analogous applications for this technology are also envisioned. Again, attempts
can be made to interfere with viral pathogenicity by targeting for degradation either viral coat protein or
enzymes necessary for viral replication or infection. It is possible to target oncoproteins for degradation,
as a possible therapeutic strategy to try to slow or alter the process of oncogenesis. It is possible to target for
degradation unwanted growth factors, or factors associated with processes which are desired to be
hindered for some period of time, such as blood clot formation. In general, it is envisioned that this technology
1. Common Methods and Materials
A. Materials 
Oligonucleotides were synthesized … E.I. du Pont. Restriction enzymes, M13mp 8 single … were purchased
from New England Biolabs. VCS-M13 was … Shrimp alkaline phosphatase (SAP) was purchased from United States
Biochemical … Perkin Elmer Cetus. E1 protein was synthesized in E. coli utilizing the TaUBA1 … in Hatfield and
Vierstra, J. Biol. Chem., 265:15813-15817 (1992), and was … Ciechanover et al., … by the method of Haas
and Wilkinson, Biochem. Prep., 15:49-60 (1985), and was then … carrier-free Na-I by the
chloramine-T method as described by Ciechanover et al., … 77:1365-1368 (1980).
Rabbit reticulocyte extract (untreated) was purchased … The wheat germ extract was
prepared according to the method of Hatfield and Vierstra … monoclonal antibodies derived from clone 9E10,
and the c-myc … Oncogene Science, Inc. Alkaline phosphatase-conjugated goat … purchased from
Kirkegaard and Perry Laboratories Inc. Human epidermal … the epidermal growth factor
receptor (EGFR) and purified epidermal growth … A copy of the transforming … et al., Cell, 38:287-297 (1984).

Chromosomal DNA from Staphylococcus aureus was purified as described for E. coli by Perbal, A Practical
Guide to Molecular Cloning (1984). E. coli cell extracts infected with … by inoculating …
media, containing tetracycline at 12 µg/ml, with 1/100th volume of an overnight culture of the E. coli strain
XL1-Blue. The culture … VCS-M13 (Stratagene) at a multiplicity of infection of 10. Infected cells were
incubated for an additional 1 hour, after which kanamycin was added to 100 µg/ml, and the culture was …
an additional 6 hours. The cells were harvested and lysed as described for the E. coli cultures following …
Sullivan and Vierstra, J. Biol. Chem., 266:23878-23885 (1991). The M13-uninfected cell extracts were prepared
as described above except that M13 phage and kanamycin were omitted.

B. Preparation of UBC1 and UBC4 Expression Vector Cassettes.

Unless stated otherwise, all techniques were performed … Sambrook et al., Molecular …
directed versions and insertion of the various DNA fragments described below …
confirmed by subsequent sequence analysis. All studies began with …
wheat (Triticum …) as described in Sullivan and Vierstra, … and
the UBC1 gene from Arabidopsis thaliana from Sullivan and Vierstra, … contained within the
phagemid pUC118 or pBluescript from Stratagene, respectively. A unique … was
placed immediately upstream of the translation termination sites in both the UBC1 and UBC4 regions
by site-directed mutagenesis performed as described by … 154:367-382 … which are
The mutagenized UBC4 cDNA was ligated into the pET3a plasmid vector containing UBC4 (Sullivan and
Vierstra, 1991) as a SphI/BamHI cassette replacing the wild-type UBC4 gene in that vector. This replacement
created an expression plasmid designated pET-UBC4. Insertion of the XhoI site into UBC1 and UBC4 resulted in
the addition of a dipeptide, Leu-Glu, to the carboxyl terminus of the native protein sequence of each protein
as encoded by the respective DNA expression plasmids.
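
The Leu-Glu dipeptide follows directly from the XhoI recognition sequence, CTCGAG, when the site is read in frame as two codons. A minimal check, assuming the site is placed in the same reading frame as the E2 coding sequence (the helper `translate` and the two-entry codon table are illustrative only):

```python
# Only the two codons of the standard genetic code needed for this check.
CODON_TABLE = {"CTC": "Leu", "GAG": "Glu"}

XHOI_SITE = "CTCGAG"  # XhoI recognition sequence


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


print("-".join(translate(XHOI_SITE)))  # Leu-Glu
```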

C. Synthesis of UBC1 and UBC4 Proteins With Carboxyl-Terminal Additions.

Additions to the carboxyl terminus of UBC1 and UBC4 proteins were accomplished by ligation of appropriate
synthetic oligonucleotide pairs or double stranded DNA fragments into the XhoI site at the 3' end of the
corresponding DNA coding sequences for the respective proteins. A phosphate group was first added to the
5'-end of the synthetic oligonucleotides using T4 polynucleotide kinase. Reactions containing the oligonucleotide
at a final concentration of 6.67 µM, 400 µM ATP, and 0.33 units/µl of T4 polynucleotide kinase dissolved in T4
polynucleotide kinase buffer were incubated at 37°C for 30 minutes. The complementary oligonucleotide
pairs were annealed to each other through the addition of one-tenth volume of 10x Annealing Buffer (Biorad)
and the mixture heated to 80°C. Reaction mixtures were then allowed to cool to room temperature over
approximately 1.5 hours.

Prior to the insertion of the altered DNA coding sequences including carboxyl terminal additions into the
expression plasmids, pET-UBC1 and pET-UBC4, the plasmids were digested with XhoI and treated with shrimp
alkaline phosphatase (SAP) to reduce the frequency of self ligation. Dephosphorylation of XhoI-digested
plasmids was performed using SAP according to the methodology described by the supplier. After
dephosphorylation, the remaining enzyme was denatured by heating the entire reaction mixture to 70°C and
holding it for 10 minutes.

D. Construction of c-myc expression vectors.

This example was intended to demonstrate the use of an epitope as the protein binding ligand. The epitope
chosen, because of convenient access to the monoclonal antibody, was c-myc. The oligonucleotide pair,
designated RV138 and RV139, presented as SEQ ID NOS: 7 and 9 respectively below, is designed to form a
double stranded DNA cassette that encodes a 10-amino-acid epitope, SEQ ID NO: 8, recognized by the mouse
monoclonal antibody designated clone 9E10. The mouse monoclonal antibody clone 9E10 was generated
against the oncogene protein c-myc as described by Evan, et al., Mol. Cell. Biol. 5(12):3610-3616 (1985), and
the ten amino acid epitope for the anti-c-myc antibody is described in Kolodziej and Young, Methods Enzymol.
194:508-519 (1991). The c-myc epitope cassette was ligated into the XhoI site of pET-UBC1 and pET-UBC4
to create two plasmids then designated pET-UBC1-(c-myc) and pET-UBC4-(c-myc). The presence of the
c-myc epitope on the protein expressed from each of these plasmids was established by inducing expression of
the E2 proteins made by these plasmids and then screening for c-myc positive strains by immunoblot analysis
using the c-myc specific antibody. In this way, it was assured that the plasmids properly expressed the
epitope recognized by that antibody.

E. Construction of a UBC1-spacer-(c-myc) expression vector.

In addition to constructing a vector in which the c-myc epitope was ligated to the carboxyl terminus of the
UBC1 E2 protein, it was also desired to add a spacer between the core region of the UBC1 E2 molecule and
the protein-binding ligand, in this case the c-myc epitope. To do this, an oligonucleotide pair designated RV136
and RV137, set forth as SEQ ID NOS: 16 and 18 below, was designed. This oligonucleotide pair, when annealed
and expressed in a host, is intended to create a 10 amino acid spacer consisting of the amino acids set forth
in SEQ ID NO: 17. This cassette was further designed to ligate into the XhoI site of pET-UBC1. Upon ligation of
the spacer oligonucleotides, the original XhoI site in pET-UBC1 was lost and a new XhoI site was created at
the 3'-end of the spacer cassette to allow for further insertions. After insertion of the spacer cassette, the
c-myc cassette was then ligated into the XhoI site in the spacer of pET-UBC1-spacer, creating the new
expression vector plasmid designated pET-UBC1-spacer-(c-myc). This construction was intended to demonstrate
the sufficiency of an artificial spacer between the core protein of the E2 and the protein binding ligand. This
construct and its use are generally illustrated in Fig. 3.

F. Construction of a UBC4-TGFα expression vector.

To obtain a sequence for a TGFα coding region, it was decided to clone by PCR a DNA region encoding
amino acids 41-89 of the human peptide hormone transforming growth factor α (TGFα). The activity of the TGFα
hormone has been determined to result from its binding to a specific cellular receptor, known as the
epidermal growth factor receptor (EGFR), present as a cellular receptor in many cells in a human body. The
portion of the coding sequence for the TGFα gene (amino acids 41-89 of the native protein) includes the binding
ligand which natively binds to the EGF receptor.
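
As a quick check on the size of the region being amplified, the residue range quoted above can be converted to a coding length. This is a small arithmetic sketch; the function names are hypothetical, and the primer-added restriction sites are not counted.

```python
def residues_in_range(first, last):
    """Number of residues in an inclusive amino acid range."""
    return last - first + 1


def coding_length_nt(first, last, stop_codon=False):
    """Nucleotides needed to encode the range, optionally with a stop codon."""
    codons = residues_in_range(first, last) + (1 if stop_codon else 0)
    return 3 * codons


# TGF-alpha ligand region, amino acids 41-89 of the native protein
print(residues_in_range(41, 89))                   # 49
print(coding_length_nt(41, 89))                    # 147
print(coding_length_nt(41, 89, stop_codon=True))   # 150
```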

Shown as SEQ ID NOS: 10 and 11 below are a pair of oligonucleotides designated RV151 and RV154,
which were designed to be primers for amplification of the desired coding region from TGFα by polymerase
chain reaction (PCR). The PCR template consisted of a plasmid containing the cDNA copy of the TGFα
prohormone form, the sequence of which can be found in GenBank accession # M31172. Additional DNA
sequence was added to oligonucleotide RV151 to encode a XhoI site. The PCR was performed on the template
to create multiple copies of a PCR product. The SalI site was then used to ligate the PCR product into the XhoI
site of pET-UBC4 to create an in-frame coding region fusion between the coding region for UBC4 and TGFα.
Likewise, oligonucleotide RV154 was designed to contain a stop codon at the 3'-end of the hormone coding
region followed by a SalI site. The product of PCR amplification was digested with SalI and XhoI, and was then
ligated into the XhoI site of pET-UBC4, to create an in-frame fusion of UBC4 and TGFα proteins while
maintaining the XhoI site at the 5'-end thereof. This plasmid was designated pET-UBC4-TGFα. This fusion protein
and its use are schematically illustrated in Fig. 4.
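
The cloning step above works because XhoI (C^TCGAG) and SalI (G^TCGAC) leave the same 5'-TCGA overhang, so a SalI end can be ligated into an XhoI-cut vector. The sketch below models only the top strand and assumes the insert ligates in the desired orientation; the vector and insert sequences are short placeholders, and `overhang` and `cut_once` are hypothetical helpers.

```python
# (recognition site, cut offset on the top strand)
XHOI = ("CTCGAG", 1)  # C^TCGAG
SALI = ("GTCGAC", 1)  # G^TCGAC


def overhang(enzyme):
    """Single-stranded overhang left after cutting (top-strand view)."""
    site, offset = enzyme
    return site[offset:len(site) - offset]


def cut_once(seq, enzyme):
    """Cut at the first occurrence of the site; returns (left, right)."""
    site, offset = enzyme
    position = seq.index(site) + offset
    return seq[:position], seq[position:]


# Placeholder sequences: a vector with one XhoI site, and a PCR product
# carrying XhoI at its 5' end and a stop codon plus SalI at its 3' end.
vector = "AAACTCGAGTTT"
pcr_product = "CTCGAG" + "GGGCCC" + "TAA" + "GTCGAC"

vector_left, vector_right = cut_once(vector, XHOI)
_, trimmed = cut_once(pcr_product, XHOI)
insert, _ = cut_once(trimmed, SALI)

ligated = vector_left + insert + vector_right
print(overhang(XHOI), overhang(SALI))  # TCGA TCGA
print(ligated)
print(ligated.count("CTCGAG"), ligated.count("GTCGAC"))  # 1 0
```

In this model the XhoI site is regenerated only at the 5' junction, while the 3' junction becomes GTCGAG, which neither enzyme recognizes. That is consistent with the statement that the XhoI site is maintained at the 5'-end of the fusion.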

G. Construction of a UBC4-GeneV expression vector.



GeneV is a protein from the M13 phage. In its native form, the protein associates in homo-dimers. The
intent of this construction was to test the ability of a single subunit of a dimer to serve as a protein binding
ligand within the present invention. Again the protein coding sequence of interest was prepared by PCR. The
oligonucleotide pair, RV220 and RV221, SEQ ID NOS: 13 and 12 respectively, was designed to amplify by
PCR the complete coding region of the GeneV protein from the bacteriophage M13, as set forth in GenBank
accession # VB0018. The oligonucleotides also include unique XhoI and SalI sites such that the GeneV coding
region amplified by PCR could be digested with XhoI and SalI and the resulting 271 base pair fragment then
could be ligated into the XhoI site of pET-UBC4 by the same method described for the UBC4-TGFα construct
above. Again this construct made, in a similar fashion, an in-frame fusion of the UBC4 protein with
the GeneV protein domain, and the construction was designated pET-UBC4-GeneV. This construction and
its use are schematically illustrated in Fig. 5.

H. Construction of a UBC4-ProteinA expression vector.

ProteinA from Staphylococcus aureus has a high natural affinity for many classes of antibodies. The
antibody-binding region of ProteinA binds to the Fc portion of antibodies rather than to the antigenic recognition
site. This was used to test whether antibodies could be targeted for destruction, although this test would be
generic to antibodies and not specific to the epitope recognized by the antibodies. This example is also
intended to demonstrate that classes of proteins can be targeted through the use of a domain for the protein
binding ligand that recognizes classes of proteins.

Again it was decided to amplify by PCR the coding region of the antibody-binding D domain (amino acids
90-193) of ProteinA, as set forth in GenBank Accession # M18264. A pair of oligonucleotide primers,
designated RV242 and RV238, set forth as SEQ ID NOS: 14 and 15, was designed both to amplify the PCR product
and to add the desired restriction sites at the ends of the PCR product. The PCR process was
performed on S. aureus chromosomal DNA. Again, the oligonucleotides provided unique XhoI sites at each
end of the amplified fragment to allow it to be digested with XhoI, and then to be ligated into
the XhoI site of pET-UBC4. This insertion created an in-frame fusion of UBC4 protein with the ProteinA D
domain and the resulting plasmid was designated pET-UBC4-ProteinA. The presence of the antibody-binding
domain in the expressed pET-UBC4-ProteinA construct was demonstrated by expression in E. coli and immunoblot
analysis using alkaline phosphatase-conjugated immunoglobulin G. The resulting fusion protein is
schematically illustrated in Fig. 6.



2. Ubiquitin conjugation assays.

A. Expression and assay of the UBC1 and UBC4 constructs.

All pET3a expression plasmids containing the UBC1 and UBC4 expression cassettes were transformed
into E. coli strain BL21(DE3).

EP 0 626 450 A2

Following induction of the pET3a expression cassette by the addition of isopropyl
β-D-thiogalactopyranoside to logarithmic growth phase cultures, the cells were harvested and lysed as
described in Sullivan and Vierstra, (1991) supra. All experiments were performed using crude lysates of cells
containing the induced plasmids. Ubiquitin conjugation assays with these lysates were performed as previously
described in Sullivan and Vierstra, (1991) supra, except that the incubation time for all assays was 2 hours.
Each of the reactions was performed in 20 µl total volume containing 1-4 µl of bacterial extracts harboring the
expressed E2 fusion protein molecules, 12 µg/ml of purified E1, 0.52 µg of 125I-ubiquitin, 1 unit of inorganic
pyrophosphatase (pyrophosphate phosphohydrolase, EC 3.6.1.1) in 20 µl of 50 mM Tris (pH 7.6 at 25°C), 10
mM MgCl2, 1 mM ATP, 0.1 mM dithiothreitol and varying concentrations of the substrate. Prior to performing
these conjugation assays, the activity of each E2 fusion molecule was determined by its ability to accept
activated ubiquitin from E1 alone and bind to the ubiquitin via a thiol-ester bond by the method described in
Sullivan and Vierstra, (1991) supra. Based on this thiol-ester assay, the volume of bacterial extracts that contained
equivalent amounts of E2 activity was determined for each construction. This normalized volume was added
to the various conjugation assays. The conjugation assays were terminated by adding an equal volume of 25
mM Tris-HCl (pH 6.8), 5% (v/v) glycerol, 4% (w/v) sodium dodecyl sulfate, 10% (v/v) 2-mercaptoethanol to the
reactions and heating the mixture to 100°C for 5 minutes. Samples were subjected to SDS-PAGE using the
system of Laemmli, Nature, 227:680 (1970). The gels were then stained with Coomassie Blue, dried between
sheets of cellophane, and used for autoradiography. Autoradiography visualizes the size of any proteins which
have bound the radiolabeled ubiquitin, and thus indicates whether the radiolabeled ubiquitin has been
attached to a target protein of the expected size.
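
The absolute amounts present in one 20 µl conjugation reaction follow from the concentrations listed above. The sketch below assumes that the quoted concentrations are final concentrations in the 20 µl reaction; the function names are hypothetical.

```python
REACTION_VOLUME_UL = 20.0


def mass_ug(conc_ug_per_ml, volume_ul=REACTION_VOLUME_UL):
    """Mass in micrograms from a concentration in ug/ml and a volume in ul."""
    return conc_ug_per_ml * volume_ul / 1000.0


def amount_nmol(conc_mm, volume_ul=REACTION_VOLUME_UL):
    """Amount in nanomoles; 1 mM corresponds to 1 nmol per ul."""
    return conc_mm * volume_ul


def conc_ug_per_ml(mass_in_ug, volume_ul=REACTION_VOLUME_UL):
    """Concentration in ug/ml from a mass in micrograms and a volume in ul."""
    return mass_in_ug / volume_ul * 1000.0


print("E1 (ug):", mass_ug(12))                     # 12 ug/ml of purified E1
print("ubiquitin (ug/ml):", conc_ug_per_ml(0.52))  # 0.52 ug of labeled ubiquitin
for name, millimolar in [("Tris", 50), ("MgCl2", 10), ("ATP", 1), ("DTT", 0.1)]:
    print(name, "(nmol):", amount_nmol(millimolar))
```

Under that assumption each reaction holds about 0.24 µg of E1, ubiquitin at about 26 µg/ml, and 20 nmol of ATP.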

B. Conjugation of ubiquitin to immunoglobulins in the presence of eukaryotic extracts.

The formation of ubiquitin-antibody conjugates in the presence of wheat germ extracts or rabbit
reticulocyte lysates was performed as described by Hatfield and Vierstra (1989), supra, for wheat germ. The
reaction mixtures for these reactions were 20 µl total, containing 12 ng/ml of purified wheat E1, 50 µg/ml of
human ubiquitin, 100 µg/ml of anti-S3 monoclonal antibody, 30 units/ml of creatine kinase, and 4 µl of wheat
germ extract or rabbit reticulocyte lysate (Promega) in 80 mM Tris (pH 8.5 at 25°C), 20 mM creatine phosphate,
7.5 mM MgCl2, 2 mM ATP, and 2 mM dithiothreitol. Each of the reactions was initiated by adding 2 µl of E2 fusion
protein extracts and incubated at 30°C for the indicated time periods. The reactions were terminated in the same
manner as the expression conjugation assays described above. The samples were subjected to SDS-PAGE and the
proteins were electroblotted to Immobilon-P. The antibody-conjugates were identified by immunoblot analysis
using alkaline phosphatase-conjugated goat-anti-mouse immunoglobulins as described previously in Sullivan
and Vierstra, supra.

C. Results of Conjugation Assays.

It was determined that in all cases E. coli was capable of expressing the chimeric UBC1 and UBC4 genes
and synthesizing the synthetic E2 fusion proteins of the expected and desired sizes. All of the E2 fusion proteins
thus created were enzymatically active, based on their ability to interact with ubiquitin via the formation of a
thiol ester bond to ubiquitin. When unmodified UBC1 or UBC4 was tested in ubiquitin conjugation assays, little
or no conjugation was observed to the desired substrate tested. Conversely, when the appropriate fusions
were tested, highly specific conjugation to the various substrates was observed, as demonstrated by
appropriately sized bands on the radiolabeled blots indicating that proteins of the expected size were tagged with
the radiolabeled ubiquitin. Thus the conjugation was detected by the attachment of free 125I-labeled ubiquitin
to the target via a peptide bond. The formation of such ubiquitin-protein conjugates was visualized by
autoradiography as a mobility shift during the SDS-PAGE analysis of ubiquitin from that of the free form to that
of a protein expected to contain both ubiquitin and a substrate protein.

As an example, whereas UBC4 could not conjugate ubiquitin to the c-myc monoclonal antibody, the fusion
protein expressed by the plasmid pET-UBC4-(c-myc) (Fig. 3) could in fact conjugate ubiquitin to the c-myc
monoclonal antibody. In this particular experiment, a single ubiquitin was added to the heavy chain of the
antibody. The ubiquitin-antibody conjugate migrated at the expected molecular mass of the heavy chain of the
antibody (55 kDa) with the addition of a single ubiquitin moiety (6 kDa). The presence of the antibody in the
conjugate was confirmed by its immunorecognition by ProteinA. The reaction was judged to be specific for
the c-myc antibody based on the fact that other mouse monoclonal antibodies failed to be conjugated by the
E2 fusion protein expressed by the pET-UBC4-(c-myc) vector and also by the fact that addition of excess
c-myc peptide blocked the antibody conjugation reaction.
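
The expected position of the conjugate band is simple addition of the masses quoted above. The sketch below uses the 55 kDa and 6 kDa figures from the text; the function name is hypothetical, and the values are apparent masses on SDS-PAGE, so the result is approximate.

```python
HEAVY_CHAIN_KDA = 55  # antibody heavy chain, as quoted in the text
UBIQUITIN_KDA = 6     # single ubiquitin moiety, as quoted in the text


def conjugate_mass_kda(substrate_kda, n_ubiquitin, ubiquitin_kda=UBIQUITIN_KDA):
    """Approximate mass of a substrate carrying n ubiquitin moieties."""
    return substrate_kda + n_ubiquitin * ubiquitin_kda


print(conjugate_mass_kda(HEAVY_CHAIN_KDA, 1))  # 61
```

A band near 61 kDa is therefore what a mono-ubiquitinated heavy chain would be expected to give under these figures.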

Using the UBC1 E2 variant in which the c-myc epitope was fused directly to the UBC1 core region, attempts
to perform similar experiments to conjugate ubiquitin to an antibody failed. Other experiments had
hinted that perhaps a spacer arm, already present in UBC4, was required. Accordingly, the construct described
above designated pET-UBC1-spacer-(c-myc) was constructed to test whether an artificial spacer arm would
suffice to render the UBC1 conjugate active. When the fusion protein expressed by the pET-UBC1-spacer-(c-myc)
expression cassette was tested, it was found capable of specifically conjugating the radiolabeled ubiquitin
to the c-myc monoclonal antibody. This result demonstrated that the protein binding ligand on the fusion protein
which is specific to the target protein must be physically placed beyond the body of the E2 core region in such
a fusion protein. As with UBC4-(c-myc), ubiquitin conjugation by the fusion protein including
UBC1-spacer-(c-myc) was specific for the c-myc antibody, and it did not modify other mouse monoclonal
antibodies. The conjugation reaction was also blocked by the addition of excess free c-myc peptide to the
reaction mixture.

To further test the ability of the system to conjugate ubiquitin to monoclonal antibodies, the construct
including domain D from ProteinA was also tested. The E2 fusion protein expressed from the plasmid designated
pET-UBC4-ProteinA was added to conjugation reactions containing purified antibodies that naturally interact
with ProteinA. In these reactions, attachment of the ubiquitin moiety to the heavy chain of the antibodies was
observed. In contrast, no such conjugation was observed when wild-type UBC4 was used in similar reactions.

The ability to attach ubiquitin to the antibody correlated with the known affinity of ProteinA for the various
antibody classes. For example, whereas mouse immunoglobulin G, which binds to ProteinA tightly, was quite
effectively conjugated by the E2 fusion protein containing UBC4 and ProteinA, an immunoglobulin G from goat,
which normally binds weakly if at all to ProteinA, was not similarly conjugated. Again, the results were
determined by radiolabeled blotting.

The GeneV protein, which associates with itself into homodimers that in turn bind cooperatively to
single-stranded DNA, was also tested utilizing the UBC4-GeneV construct. First it was determined that the UBC4
protein expressed by itself was unable to conjugate ubiquitin to GeneV protein. Similar experiments were then
performed using the E2 fusion protein expressed from the plasmid pET-UBC4-GeneV. That fusion protein was found
capable of creating ubiquitin-GeneV conjugates. In this case, the recognition of GeneV is apparently
accomplished by dimerization between the wild-type GeneV and the GeneV domain of the E2 fusion protein
consisting of both UBC4 and GeneV. This demonstrates that dimerization-type affinities between target proteins
and the protein binding domain of E2 fusion molecules constructed in accordance with the present invention are
sufficient to allow those proteins to be targeted for ubiquitin conjugation.

To test the ability of the E2 fusion protein strategy described herein to conjugate ubiquitin to hormones or
receptors, the E2 fusion protein consisting of UBC4 and TGFα was utilized. Both TGFα and the related peptide
hormone, epidermal growth factor (EGF), are capable of highly specific and very tight binding to the EGF
receptor (EGFR). When the E2 fusion protein expressed by the plasmid pET-UBC4-TGFα was added to crude human
epidermal membranes, the result was highly specific modification of the EGFR protein by conjugation with
ubiquitin. This conjugation was specific for the TGFα-EGFR pair, as judged by the failure of unmodified UBC4
protein to ubiquitinate the receptor and by the ability of excess free epidermal growth factor in the reaction
mixture to block the reaction and prevent ubiquitination of the EGF receptor. The result also clearly
demonstrates that species differences between E2s and ubiquitins are not critical to this reaction, since
UBC4, which is of plant origin, is clearly capable of ubiquitinating a target molecule of mammalian origin.
Since M13 is a bacteriophage, the same phenomenon can be used on bacterial targets as well.

It has been observed that selective proteolytic degradation by the ubiquitin-directed protein degradation
pathway appears to involve the conjugation of multiple ubiquitins to the target protein, in many cases forming
a multiubiquitin chain. In the conjugation assays described above, which used the bacterially expressed E2s,
attachment of only a single ubiquitin to the target molecule was detected in most cases. As a result, it could
be argued that ubiquitination by the E2 carboxyl-terminal fusion proteins described herein would not form the
multiubiquitinated intermediates necessary to cause the targeted protein to enter the degradative pathway.

To test whether that limitation was a real one, experiments were conducted in which crude eukaryotic cell
extracts, either rabbit reticulocyte extract or wheat germ extract, were added to the ubiquitin conjugation
assays. Since such crude extracts often contain endogenous multiubiquitin chains, the idea was to test whether
such multiubiquitin chains could be added to the target molecule by the E2 fusion proteins described herein.
Such attachment of multiubiquitin chains to the target molecules was observed. For example, in the absence
of such extracts, only a single ubiquitin becomes attached to mouse immunoglobulin G in the presence of the E2
fusion protein consisting of UBC4-ProteinA. But upon the addition of either wheat germ extract or rabbit
reticulocyte extract to the reaction mixture, the same system was capable of generating ubiquitinated forms of
mouse immunoglobulin G with as many as 7-8 ubiquitin repeats attached to the antibody. Based on evidence
available to date, such heavily modified forms represent acceptable substrates for subsequent degradation
by the ubiquitin system and are highly likely to be recognized by that system and then subjected to proteolytic
degradation. These results demonstrate that the E2 fusion proteins described herein are capable of generating
proteolytic intermediates with the help of other endogenous factors normally present within eukaryotic cells.
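
The mobility shifts reported in these assays follow from simple mass arithmetic. As a rough sketch, assuming the approximate masses quoted earlier in this description (55 kDa for the antibody heavy chain and 6 kDa per ubiquitin moiety) and ignoring any anomalous migration of branched conjugates on SDS-PAGE, the nominal masses of the mono- and multiubiquitinated forms can be tabulated as follows. The function name and constants are illustrative and are not part of the original disclosure.

```python
# Illustrative only: approximate masses as quoted in the description.
HEAVY_CHAIN_KDA = 55  # antibody heavy chain
UBIQUITIN_KDA = 6     # one ubiquitin moiety (value quoted in the text)


def conjugate_mass_kda(n_ubiquitins, substrate_kda=HEAVY_CHAIN_KDA,
                       ubiquitin_kda=UBIQUITIN_KDA):
    """Return the nominal mass of a substrate carrying n ubiquitin moieties."""
    if n_ubiquitins < 0:
        raise ValueError("number of ubiquitins cannot be negative")
    return substrate_kda + n_ubiquitins * ubiquitin_kda


if __name__ == "__main__":
    # 0 = unmodified heavy chain; 1 = single conjugate seen without extract;
    # up to 8 = the most heavily modified forms reported with extract added.
    for n in range(0, 9):
        print(f"{n} ubiquitin(s): ~{conjugate_mass_kda(n)} kDa")
```

Under these assumptions each added ubiquitin moves the band up by one fixed increment, which is why a ladder of bands is the expected signature of multiubiquitination.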

Hypothetical Example 


The experiments above demonstrate that the specificity of ubiquitin conjugation can be modified in
predetermined ways to target new proteins for degradation. The versatility of this approach depends on
identifying a protein binding domain which can be attached to the E2 fusion molecule and which will then bind
to the protein of interest. Because information about the nature of the interactions between many proteins and
other proteins within the cell is limited, the use of natural protein-protein interactions would restrict the
present technology to only a few well characterized types. However, the exploitation of antibody/antigen
reactions has the power to overcome this obstacle. It is possible, of course, to create antibodies which bind
to most specific proteins of interest. The binding domains (Fab) of such antibodies, which involve amino acids
from both the heavy and light immunoglobulin chains, can now be identified and expressed as single shorter
peptides which are referred to as single chain monoclonal antibodies. The genes for the single chain
monoclonal antibodies express Fab region fragments linked by a short flexible spacer region. This concept is
illustrated in Fig. 7. It is intended that such single chain monoclonal antibodies can be fused to the carboxyl
terminus of E2s like UBC4 to create E2 fusion proteins which can be targeted through the Fab region to any
protein of interest for which a monoclonal antibody is either available or can be developed. A kit for
constructing this type of antibody has recently been commercialized, allowing the facile development of single
chain antibody genes against any suitable antigenic protein. The availability of this technology suggests that
E2 fusion proteins with an Fab protein ligand binding region can be constructed to target virtually any protein
for degradation using this system.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

  (i) APPLICANT:
    (A) NAME: WISCONSIN ALUMNI RESEARCH FOUNDATION
    (B) STREET: P.O. Box 7365
    (C) CITY: Madison
    (D) STATE: Wisconsin
    (E) COUNTRY: U.S.A.
    (F) POSTAL CODE (ZIP): 53707-7365

  (ii) TITLE OF INVENTION: UBIQUITIN CONJUGATING ENZYME (E2) FUSION PROTEINS

  (iii) NUMBER OF SEQUENCES: 18

  (iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

  (v) CURRENT APPLICATION DATA:
    APPLICATION NUMBER: EP 94303903.2



(2) INFORMATION FOR SEQ ID NO:1:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 757 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: cDNA

  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 100..558

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGGCGGTCAA CACCGCTGAA CACATATGAA AGAAAGACGA CCTCTTCTCT CCGCGATCTT  60

TACCTCAACA ACGAGATCTG TTTCCAGAAA GAAAGGAGG ATG TCG ACG CCA GCA  114
                                           Met Ser Thr Pro Ala
                                           1               5

AGG AAG AGG TTA ATG AGG GAT TTC AAG AGG TTG CAG CAA GAC CCA CCT  162
Arg Lys Arg Leu Met Arg Asp Phe Lys Arg Leu Gln Gln Asp Pro Pro
                10                  15                  20

GCG GGT ATT AGT GGT GCT CCA CAG GAC AAC AAC ATT ATG CTC TGG AAT  210
Ala Gly Ile Ser Gly Ala Pro Gln Asp Asn Asn Ile Met Leu Trp Asn
            25                  30                  35

GCT GTC ATA TTT GGG CCT GAT GAC ACA CCA TGG GAT GGA GGT ACT TTC  258
Ala Val Ile Phe Gly Pro Asp Asp Thr Pro Trp Asp Gly Gly Thr Phe
        40                  45                  50

AAA CTC TCA CTG CAG TTC TCT GAA GAT TAT CCC AAT AAA CCA CCA ACA  306
Lys Leu Ser Leu Gln Phe Ser Glu Asp Tyr Pro Asn Lys Pro Pro Thr
    55                  60                  65

GTT CGG TTT GTG TCA CGG ATG TTT CAT CCT AAT ATT TAT GCA GAT GGG  354
Val Arg Phe Val Ser Arg Met Phe His Pro Asn Ile Tyr Ala Asp Gly
70                  75                  80                  85

AGT ATC TGC TTG GAC ATT CTA CAA AAC CAG TGG AGT CCA ATC TAT GAT  402
Ser Ile Cys Leu Asp Ile Leu Gln Asn Gln Trp Ser Pro Ile Tyr Asp
                90                  95                  100

GTT GCT GCT ATA CTT ACC TCC ATC CAG TCC TTG CTC TGT GAC CCT AAT  450
Val Ala Ala Ile Leu Thr Ser Ile Gln Ser Leu Leu Cys Asp Pro Asn
            105                 110                 115

CCG AAT TCT CCT GCA AAC TCG GAA GCT GCT CGG ATG TAC AGC GAA AGC  498
Pro Asn Ser Pro Ala Asn Ser Glu Ala Ala Arg Met Tyr Ser Glu Ser
        120                 125                 130

AAG CGC GAG TAC AAC AGG AGA GTG CGT GAT GTT GTT GAG CAA AGC TGG  546
Lys Arg Glu Tyr Asn Arg Arg Val Arg Asp Val Val Glu Gln Ser Trp
    135                 140                 145

ACT GCT GAC TAGTAGTAGT TTGTTGTAAG CGTTGTAGCT CTCTCTACTT  595
Thr Ala Asp
150

TCTCTCAATC ACGATTCAGC AACAGCTTTC TTCTCTTTTC ATTCATGTCT TGTGTTTCCA  655

AAACTATTTA AGTGATTCCA TGCTTTGATG TAACCCAACA TCCTTAAAAA AACAACTTTG  715

TACCAAACCA TCTGAATTAT TCACTTTTGT GTATAAAAAA AA  757



(2) INFORMATION FOR SEQ ID NO:2:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 152 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: protein

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ser Thr Pro Ala Arg Lys Arg Leu Met Arg Asp Phe Lys Arg Leu
1               5                   10                  15

Gln Gln Asp Pro Pro Ala Gly Ile Ser Gly Ala Pro Gln Asp Asn Asn
            20                  25                  30

Ile Met Leu Trp Asn Ala Val Ile Phe Gly Pro Asp Asp Thr Pro Trp
        35                  40                  45

Asp Gly Gly Thr Phe Lys Leu Ser Leu Gln Phe Ser Glu Asp Tyr Pro
    50                  55                  60

Asn Lys Pro Pro Thr Val Arg Phe Val Ser Arg Met Phe His Pro Asn
65                  70                  75                  80

Ile Tyr Ala Asp Gly Ser Ile Cys Leu Asp Ile Leu Gln Asn Gln Trp
                85                  90                  95

Ser Pro Ile Tyr Asp Val Ala Ala Ile Leu Thr Ser Ile Gln Ser Leu
            100                 105                 110

Leu Cys Asp Pro Asn Pro Asn Ser Pro Ala Asn Ser Glu Ala Ala Arg
        115                 120                 125

Met Tyr Ser Glu Ser Lys Arg Glu Tyr Asn Arg Arg Val Arg Asp Val
    130                 135                 140

Val Glu Gln Ser Trp Thr Ala Asp
145                 150



(2) INFORMATION FOR SEQ ID NO:3:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 980 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: cDNA

  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 60..614

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGAATTCCCA AACCTACAAG CAGGGCAAGG AGGAGGAGGA AGAAGAAGAA GAAGCAAAC  59

ATG TCT TCC CCA AGC AAG CGC AGG GAG ATG GAT CTC ATG AAG CTG ATG  107
Met Ser Ser Pro Ser Lys Arg Arg Glu Met Asp Leu Met Lys Leu Met
1               5                   10                  15

ATG AGT GAC TAC AAG GTG GAC ATG ATC AAC GAC GGG ATG CAC GAG TTC  155
Met Ser Asp Tyr Lys Val Asp Met Ile Asn Asp Gly Met His Glu Phe
            20                  25                  30

TTC GTC CAC TTC CAC GGA CCC AAA GAC AGT ATT TAC CAG GGT GGT GTG  203
Phe Val His Phe His Gly Pro Lys Asp Ser Ile Tyr Gln Gly Gly Val
        35                  40                  45

TGG AAG GTC AGG GTT GAA CTC ACC GAA GCT TAC CCT TAC AAA TCC CCT  251
Trp Lys Val Arg Val Glu Leu Thr Glu Ala Tyr Pro Tyr Lys Ser Pro
    50                  55                  60

TCC ATT GGC TTC ACC AAC AAG ATC TAT CAC CCC AAT GTC GAT GAG ATG  299
Ser Ile Gly Phe Thr Asn Lys Ile Tyr His Pro Asn Val Asp Glu Met
65                  70                  75                  80

TCT GGT TCT GTC TGC TTG GAT GTG ATC AAT CAG ACA TGG AGC CCG ATG  347
Ser Gly Ser Val Cys Leu Asp Val Ile Asn Gln Thr Trp Ser Pro Met
                85                  90                  95

TTT GAC CTT GTG AAT ATC TTT GAG GTG TTC CTG CCC CAG CTT CTC CTG  395
Phe Asp Leu Val Asn Ile Phe Glu Val Phe Leu Pro Gln Leu Leu Leu
            100                 105                 110

TAC CCG AAC CCC TCG GAC CCC TTG AAC GGC GAG GCG GCT TCG CTC ATG  443
Tyr Pro Asn Pro Ser Asp Pro Leu Asn Gly Glu Ala Ala Ser Leu Met
        115                 120                 125

ATG CGC GAC AAG AAT GCC TAT GAA AAT AAA GTC AAA GAA TAT TGT GAG  491
Met Arg Asp Lys Asn Ala Tyr Glu Asn Lys Val Lys Glu Tyr Cys Glu
    130                 135                 140

AGA TAT GCC AAG CCT GAA GAT ATA TCC CCA GAG GAG GAA GAG GAG GAG  539
Arg Tyr Ala Lys Pro Glu Asp Ile Ser Pro Glu Glu Glu Glu Glu Glu
145                 150                 155                 160

AGT GAT GAG GAG CTG AGC GAC GCC GAG GGC TAC GAC TCC GGC GAC GAG  587
Ser Asp Glu Glu Leu Ser Asp Ala Glu Gly Tyr Asp Ser Gly Asp Glu
                165                 170                 175

GCC ATC ATG GGC CAC GCA GAC CCT TAACTGGTGG ATGGATGCAA GGATGGTTAG  641
Ala Ile Met Gly His Ala Asp Pro
            180                 185

CTCAGTCAGT AACTCAGTAA TGCAGGTGAT CATGATGTAT CTCTGTCTGT CAGTCTGTAC  701

ATAGCTGCGG CGATCACTGA TGAATGCCGC CATGGCAGAT GCTGAAGAAA GTCATCAGCC  761

ATCTCAACTC AGCTCCACTA GTTCTTGTGT GTCCCGCTGT GAATAACTTG CCATTTGTTT  821

GTGTTGGTTC CATTTGCAGT TCATGTTTCC ATTCTAGGAG ATGTCTGTTC TTCTGTTTTG  881

TTGATTTCAT TTCCAGTTCA TGTTACTACT GTATGTTTCC CTTTCCTACC TGTAATCATC  941

TCAGGGGAAT TTAAATCTGC TCTGCATGTC CAGGAATTC  980



(2) INFORMATION FOR SEQ ID NO:4:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 184 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: protein

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Ser Ser Pro Ser Lys Arg Arg Glu Met Asp Leu Met Lys Leu Met
1               5                   10                  15

Met Ser Asp Tyr Lys Val Asp Met Ile Asn Asp Gly Met His Glu Phe
            20                  25                  30

Phe Val His Phe His Gly Pro Lys Asp Ser Ile Tyr Gln Gly Gly Val
        35                  40                  45

Trp Lys Val Arg Val Glu Leu Thr Glu Ala Tyr Pro Tyr Lys Ser Pro
    50                  55                  60

Ser Ile Gly Phe Thr Asn Lys Ile Tyr His Pro Asn Val Asp Glu Met
65                  70                  75                  80

Ser Gly Ser Val Cys Leu Asp Val Ile Asn Gln Thr Trp Ser Pro Met
                85                  90                  95

Phe Asp Leu Val Asn Ile Phe Glu Val Phe Leu Pro Gln Leu Leu Leu
            100                 105                 110

Tyr Pro Asn Pro Ser Asp Pro Leu Asn Gly Glu Ala Ala Ser Leu Met
        115                 120                 125

Met Arg Asp Lys Asn Ala Tyr Glu Asn Lys Val Lys Glu Tyr Cys Glu
    130                 135                 140

Arg Tyr Ala Lys Pro Glu Asp Ile Ser Pro Glu Glu Glu Glu Glu Glu
145                 150                 155                 160

Ser Asp Glu Glu Leu Ser Asp Ala Glu Gly Tyr Asp Ser Gly Asp Glu
                165                 170                 175

Ala Ile Met Gly His Ala Asp Pro
            180
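
The feature tables for the two cDNA entries can be cross-checked against the stated protein lengths. The following sketch is not part of the original listing; it assumes that each CDS location is 1-based, inclusive, and includes the terminating stop codon, and the helper name is hypothetical. Under that assumption the CDS coordinates given for SEQ ID NO:1 and SEQ ID NO:3 should predict the lengths given for SEQ ID NO:2 and SEQ ID NO:4.

```python
def protein_length_from_cds(start, end, includes_stop=True):
    """Number of amino acids encoded by a 1-based, inclusive CDS location."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError(f"CDS length {n_nt} is not a multiple of three")
    codons = n_nt // 3
    # Drop one codon if the location is assumed to include the stop codon.
    return codons - 1 if includes_stop else codons


# (CDS start, CDS end) and stated protein length, as given in this listing.
LISTING = {
    "SEQ ID NO:1 / NO:2": ((100, 558), 152),
    "SEQ ID NO:3 / NO:4": ((60, 614), 184),
}

if __name__ == "__main__":
    for name, ((start, end), stated) in LISTING.items():
        predicted = protein_length_from_cds(start, end)
        status = "consistent" if predicted == stated else "MISMATCH"
        print(f"{name}: predicted {predicted} aa, stated {stated} aa ({status})")
```

A check of this kind is useful mainly for catching transcription errors in the coordinates; it says nothing about the correctness of the sequences themselves.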



(2) INFORMATION FOR SEQ ID NO:5:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 37 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GGCCACGCAG ACCCTCTCGA GTAGGATGGA TGCAAGG

(2) INFORMATION FOR SEQ ID NO:6:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 41 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GCAAAGCTGG ACTGCTCTCG AGTAGTAGTT TGTTGTAAGC G

(2) INFORMATION FOR SEQ ID NO:7:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 39 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 6..35

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TCGAG GAG CAG AAG CTG ATC AGC GAG GAG GAC CTG TAAC
      Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
      1               5                   10

(2) INFORMATION FOR SEQ ID NO:8:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: protein

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1               5                   10

(2) INFORMATION FOR SEQ ID NO:9:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 39 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

TCGAGTTACA GGTCCTCCTC GCTGATCAGC TTCTGCTCC
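
SEQ ID NO:7 and SEQ ID NO:9 appear to form a complementary oligonucleotide pair encoding the c-myc epitope of SEQ ID NO:8. The following sketch, added for illustration and not part of the original listing, checks this under the assumption that the two oligonucleotides are intended to be annealed to one another. The function names are hypothetical, and the codon table is deliberately limited to the codons that occur in the CDS.

```python
SEQ7 = "TCGAGGAGCAGAAGCTGATCAGCGAGGAGGACCTGTAAC"  # SEQ ID NO:7
SEQ9 = "TCGAGTTACAGGTCCTCCTCGCTGATCAGCTTCTGCTCC"  # SEQ ID NO:9

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the codons present in the CDS (positions 6..35 of SEQ ID NO:7).
CODONS = {"GAG": "Glu", "CAG": "Gln", "AAG": "Lys", "CTG": "Leu",
          "ATC": "Ile", "AGC": "Ser", "GAC": "Asp"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, start, end):
    """Translate a 1-based, inclusive CDS location into three-letter codes."""
    cds = seq[start - 1:end]
    return [CODONS[cds[i:i + 3]] for i in range(0, len(cds), 3)]


def five_prime_overhang(top, bottom):
    """Smallest 5' overhang k (nt) on `bottom` for which the remainder of
    `bottom` pairs perfectly with `top`; None if no such k exists."""
    rc = reverse_complement(top)
    for k in range(len(bottom)):
        if bottom[k:] == rc[:len(bottom) - k]:
            return k
    return None


if __name__ == "__main__":
    print(" ".join(translate(SEQ7, 6, 35)))
    k = five_prime_overhang(SEQ7, SEQ9)
    print(f"overhang: {k} nt, top {SEQ7[:k]}, bottom {SEQ9[:k]}")
```

In this sketch the two strands pair over 35 nucleotides and each leaves a four-nucleotide 5'-TCGA overhang, which would be compatible with XhoI-cut ends; whether that is how the fragment was actually cloned is a matter for the construction details given elsewhere in the specification.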
(2) INFORMATION FOR SEQ ID NO:10:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 45 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

CCCGCCCGTG GCTGCACTCG AGGTGTCCCA TTTTAATGAC TGCCC

(2) INFORMATION FOR SEQ ID NO:11:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 49 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GGCCTGCTTC TTCTGGCTGG CGTCGACCTA GGCCAGGAGG TCCGCATGC

(2) INFORMATION FOR SEQ ID NO:12:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 35 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

AGGTAACTCG AGATGATTAA AGTTGAAATT AAACC

(2) INFORMATION FOR SEQ ID NO:13:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 34 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

CGACCTGGTC GACGTTACTT AGCCGGAACG AGGC

(2) INFORMATION FOR SEQ ID NO:14:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 37 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

CTTAATGACC TCGAGGCTCC AAAAGCTGAT GCGCAAC

(2) INFORMATION FOR SEQ ID NO:15:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 37 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

GTTGAAATTC TCGAGTTATT TCGGTGCTTG AGATTCG

(2) INFORMATION FOR SEQ ID NO:16:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 39 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 3..22

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

TC GAA CCA CCA GTC GAC GCA GCA GCA GCA GCA CTCGAGT
   Glu Pro Pro Val Asp Ala Ala Ala Ala Ala
   1               5                   10

(2) INFORMATION FOR SEQ ID NO:17:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: protein

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Glu Pro Pro Val Asp Ala Ala Ala Ala Ala
1               5                   10

(2) INFORMATION FOR SEQ ID NO:18:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 39 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

TCGAACTCGA GTGCTGCTGC TGCTGCGTCG ACTGGTGGT

Claims

1. A DNA sequence encoding a ubiquitin-conjugating E2 fusion protein, the DNA sequence comprising:
    a promoter effective in the cells of a host to express a protein coding sequence located 3' to the
    promoter; and
    a protein coding sequence located 3' to the promoter, the protein coding sequence including, 5' to 3':
        a DNA coding sequence encoding a ubiquitin-conjugating E2 core region;
        a DNA coding sequence encoding a spacer of at least four amino acids; and
        a DNA coding sequence encoding a heterologous protein binding ligand having affinity for a target
        protein.

2. A DNA sequence according to claim 1 wherein the E2 core region is from a plant E2 gene.

3. A DNA sequence according to claim 2 wherein the E2 core region is selected from the group consisting
of the core regions of UBC1, shown in SEQ ID NO:1, and UBC4, shown in SEQ ID NO:3.

4. A DNA sequence according to any one of the preceding claims wherein the DNA coding sequences for
both the core region and the spacer are from a single native E2 coding region.

5. A DNA sequence according to claim 4 wherein the DNA coding sequences for both the core region and
the spacer are from UBC4, shown in SEQ ID NO:3.

6. A DNA sequence according to any one of claims 1 to 3 wherein the spacer is an artificial amino acid
sequence.

7. A DNA sequence according to any one of the preceding claims wherein the protein binding ligand is a
recognition domain of an antibody, the domain having binding specificity for the target protein.

8. A DNA sequence as claimed in any one of claims 1 to 6 wherein the protein binding ligand is a domain of
protein A having binding specificity for antibodies.

9. A DNA sequence according to any one of claims 1 to 6 wherein the protein binding ligand is selected from
the group consisting of protein hormones and cellular receptors for protein hormones.

10. A DNA sequence according to any one of claims 1 to 6 wherein the protein binding ligand is an epitope
recognized by an antibody.

11. A replication and expression vector comprising a DNA sequence according to any one of the preceding
claims.

12. Host cells transformed or transfected with a replication and expression vector according to claim 11.

13. A non-human transgenic eukaryotic organism comprising in its genome the DNA of any one of claims 1 to
10.

14. A non-human transgenic eukaryotic organism as claimed in claim 13 wherein the organism is a plant.

15. A method of producing a ubiquitin-conjugating E2 fusion protein which comprises culturing host cells
according to claim 12 under conditions effective to express a ubiquitin-conjugating E2 fusion protein
comprising, from amino to carboxyl terminus:
    a core region derived from a ubiquitin-conjugating E2 protein and which is functionally able to form
    a thiol linkage to a ubiquitin protein by transesterification from an E1 protein;
    a spacer of at least four amino acids; and
    a heterologous protein binding ligand having specific binding affinity for at least one target protein,
so that the entire E2 fusion protein will be capable of forming a thiol linkage to at least one ubiquitin moiety,
binding to target protein, and transferring the ubiquitin moiety into a covalent bond to the target protein.

16. A ubiquitin-conjugating E2 fusion protein comprising, from amino to carboxyl terminus:
    a core region derived from a ubiquitin-conjugating E2 protein and which is functionally able to form
    a thiol linkage to a ubiquitin protein by transesterification from an E1 protein;
    a spacer of at least four amino acids; and
    a heterologous protein binding ligand having specific binding affinity for at least one target protein,
so that the entire E2 fusion protein will be capable of forming a thiol linkage to at least one ubiquitin moiety,
binding to target protein, and transferring the ubiquitin moiety into a covalent bond to the target protein.

17. A protein according to claim 16 which is encoded by a DNA as claimed in any one of claims 2 to 10.

18. A non-human transgenic eukaryotic organism comprising the E2 fusion protein of claim 16.

19. A non-human transgenic eukaryotic organism as claimed in claim 18 wherein the organism is a plant.

20. A method of causing the addition of a ubiquitin moiety to a target protein comprising the steps of
    (a) constructing a DNA sequence for a ubiquitin-conjugating E2 fusion protein including:
        a promoter effective in the cells of a host to express a protein coding sequence located 3' to the
        promoter; and
        a protein coding sequence located 3' to the promoter, the protein coding sequence including, 5' to 3':
            a DNA coding sequence encoding a ubiquitin-conjugating E2 core region;
            a DNA coding sequence encoding a spacer of at least four amino acids; and
            a DNA coding sequence encoding a heterologous protein binding ligand having affinity for the
            target protein;
    (b) transforming the DNA sequence from step (a) into a host in which the promoter is capable of causing
    expression of the E2 fusion protein, so that the E2 fusion protein is produced in the host; and
    (c) exposing the E2 fusion protein to ubiquitin-linked E1 and a source of energy so that the E2 fusion
    protein will accept the ubiquitin from the E1 and transfer the ubiquitin specifically to the target protein
    if present.

21. A method according to claim 20 wherein transformation is performed in a host containing the target protein
and the fusion protein is exposed to the ubiquitin-linked E1 by in vivo expression of the E2 fusion protein
in the cells of the host.

22. A method according to claim 20 wherein the target protein is not in the host in which the fusion protein is
expressed and the method further comprises recovering the E2 fusion protein expressed in the host and
introducing the E2 fusion protein into a second host in which the target protein is present.

23. A method according to claim 20, 21 or 22 in which the degradation of a target protein is directed by
exposing the E2 fusion protein to ubiquitin-linked E1, a source of energy, and a proteasome proteolytic
complex, so that the E2 fusion protein will accept the ubiquitin from the E1 and transfer the ubiquitin
specifically to the target protein, directing degradation of the target protein by the proteasome.

24. A method of directing the degradation of a target protein in a host organism comprising the steps of
    (a) constructing a DNA sequence for a ubiquitin-conjugating E2 fusion protein including:
        a promoter effective in the cells of a host to express a protein coding sequence located 3' to the
        promoter; and
        a protein coding sequence located 3' to the promoter, the protein coding sequence including, 5' to 3':
            a DNA coding sequence encoding a ubiquitin-conjugating E2 core region;
            a DNA coding sequence encoding a spacer of at least four amino acids; and
            a DNA coding sequence encoding a heterologous protein binding ligand having affinity for the
            target protein; and
    (b) transforming the DNA sequence from step (a) into the host organism so that the E2 fusion protein
    will add a ubiquitin moiety to the target protein to direct the degradation of the target protein by a
    cellular protein degrading pathway.

25. A method according to claim 24 wherein the host is a plant.

26. A method of causing the degradation of a target protein in a target host comprising the steps of
    (a) constructing a DNA sequence for a ubiquitin-conjugating E2 fusion protein including:
        a promoter effective in the cells of a host to express a protein coding sequence located 3' to the
        promoter; and
        a protein coding sequence located 3' to the promoter, the protein coding sequence including, 5' to 3':
            a DNA coding sequence encoding a ubiquitin-conjugating E2 core region;
            a DNA coding sequence encoding a spacer of at least four amino acids; and
            a DNA coding sequence encoding a heterologous protein binding ligand having affinity for the
            target protein;
    (b) transforming the DNA sequence from step (a) into a production host in which the promoter is capable
    of causing expression of the E2 fusion protein, so that the E2 fusion protein is produced in the
    production host;
    (c) recovering the expressed E2 fusion protein from the production host; and
    (d) introducing the recovered E2 fusion protein into the target host.

27. A method according to claim 26 wherein the target host is a mammal.